Improving Data Quality in Data Warehousing Applications

Authors

  • Lin Li
  • Taoxin Peng
  • Jessie Kennedy
Abstract

There is a growing awareness that high-quality data is key to today's business success, and that dirty data existing within data sources is one of the causes of poor data quality. To ensure high quality, enterprises need processes, methodologies, and resources to monitor and analyze the quality of their data, as well as methodologies for preventing, detecting, and repairing dirty data. In practice, however, detecting and cleaning all the dirty data that exists in all data sources is expensive and unrealistic, and most enterprises need to consider the cost of cleaning. A conflict therefore arises when an organization intends to clean its data warehouses: how should it select the most important data to clean based on its business requirements? In this paper, business rules are used to classify dirty data types based on data quality dimensions. The proposed method helps to solve this problem by allowing users to select the appropriate group of dirty data types according to the priority of their business requirements. It also provides guidelines for measuring data quality with respect to different data quality dimensions and should be helpful for the development of data cleaning tools.


Similar resources

New Trends in Data Warehousing and Data Analysis

Title (Type)

  • new trends in data warehousing and data analysis annals of information systems (PDF)
  • data and information quality dimensions principles and techniques data-centric systems and applications (PDF)
  • information security risk assessment toolkit practical assessments through data collection and data analysis (PDF)
  • web data mining exploring hyperlinks contents and usage data data-centric systems ...


Preface Chapter 2, "Dynamic Workload for Schema Evolution in Data Warehouses: a Performance Issue,"

Data warehousing and knowledge discovery are established key technologies in many application domains. Enterprises and organizations improve their abilities in data analysis, decision support, and the automatic extraction of knowledge from data; for scientific applications to analyze collected data, for medical applications for quality assurance and for steps to individualized medicine, to ment...


An Empirical Investigation of the Impact of Data Quality and its Antecedents on Data Warehousing

Data warehousing is a topic of great interest in the business community, due to increasing business intelligence demands, coupled with increased data availability and processing capability. Despite large financial backing of data warehousing implementations, many fail. Little research has been conducted pertaining to data warehousing success. Traditional system success models (DeLone and McLean...


An Object-oriented Quality Framework with Optimization Models for Managing Data Quality in Data Warehouse Applications

Data quality is an important issue, especially in large-scale data applications such as data warehousing (DW). The validity (a super quality type specialized by accuracy, completeness, consistency, and currency) of data in fact has corresponding impacts on ad-hoc decisions. To ensure quality, improvement actions such as edit check, imputation, and audit et al. are applied. Yet these utilize an...


Network Resource Management for Improving Users Quality of experience in Software Defined Network by Weighted Fuzzy Petri-Net Method

The rapid rise in popularity of multimedia applications, such as VoIP, IPTV and Video Conferencing, intensifies the need to consider resource management for user satisfaction. Furthermore, improving Quality of Experience (QoE) in Software Defined Networks (SDNs) services is one of the important issues to be addressed by provisioning optimum resource management. In this paper, resource allocatio...




Publication date: 2010